🏠 Local LLM Deployment
Model Optimization, GPU Acceleration, Inference, Privacy
Scoured 8291 posts in 68.8 ms
llama.cpp guide - Running LLMs locally, on any hardware, from scratch
blog.steelph0enix.dev · 6h
🖥️ Self-hosted apps
Zero-Latency Local AI: Tuning Your Linux Kernel for LLM Inference 🐧🧠
dev.to · 1d · Discuss: DEV
🖥️ Self-hosted apps
LlamaLib: A cross-platform C++/C# library for local LLMs based on llama.cpp
github.com · 2d · Discuss: Hacker News
🗃️ SQLite
Math ∩ Programming
jeremykun.com · 12h
🗃️ SQLite
Optimized LLM Inference Engines
rishirajacharya.com · 4d
🗃️ SQLite
From Prediction to Compilation: A Manifesto for Intrinsically Reliable AI
news.ycombinator.com · 22h · Discuss: Hacker News
🗃️ SQLite
Show HN: Molinar – Open-source alternative to ai.com (AGPL-3.0)
business.molinar.ai · 3h · Discuss: Hacker News
🖥️ Self-hosted apps
SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration
arxiv.org · 3d
🗃️ SQLite
How I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS
mohammedeabdelaziz.github.io · 1d · Discuss: Hacker News
🗃️ SQLite
ML-LIB: Machine Learning Library Proposed For The Linux Kernel
phoronix.com · 2d · Discuss: Hacker News
🖥️ Self-hosted apps
Circumstantial Complexity, LLMs and Large Scale Architecture
datagubbe.se · 1d · Discuss: Lobsters, Hacker News
🖥️ Self-hosted apps
Hitting 1,000 tokens per second on a single RTX 5090
blog.alpindale.net · 11h · Discuss: Hacker News
🖥️ Home Lab Setup
Study: Platforms that rank the latest LLMs can be unreliable
news.mit.edu · 5h
⭐ Awesome lists
hanig/engram: Personal knowledge graph and automation system
github.com · 1d
🖥️ Self-hosted apps
Concurrent vs. Parallel Execution in LLM API Calls: From an AI Engineer’s Perspective
pub.towardsai.net · 4h
🖥️ Self-hosted apps
Understanding LLM Inference Engines: Inside Nano-vLLM (Part 2)
neutree.ai · 2d · Discuss: Hacker News
🗃️ SQLite
Finding the needle in the logstack: Reducing LLM context with TF-IDF
eliseomartelli.it · 3d
🗃️ SQLite
I Let AI Agents Train Their Own Models. Here's What Actually Happened.
hamzamostafa.com · 6h · Discuss: Hacker News
🖥️ Self-hosted apps
Unlocking core memories with GoldSrc engine and CS 1.6 (2025)
danielbrendel.com · 22h · Discuss: Hacker News
🗃️ SQLite
MCP multiplexer that cuts agent context usage by 95%
mcplexor.com · 3h · Discuss: Hacker News
🖥️ Self-hosted apps